Python with Artificial Intelligence
- Introduction to the course
- Fundamentals of Programming
- Python for Data Science Introduction (2 hrs to 4 hrs)
- Python, Anaconda and relevant packages installations
- Why learn Python?
- Keywords and Identifiers
- Comments, Indentation, and Statements
- Variables and Datatypes in Python
- Standard Input and Output
- Operators
- Control flow: If...else
- Control flow: while loop
- Control flow: for loop
- Control flow: break and continue
- Python for Data Science: Data Structures (2 hrs)
- Lists
- Tuples
- Sets
- Dictionary
- Strings
- Python for Data Science: Functions (2 hrs)
- Introduction
- Types of function
- Function arguments
- Recursive functions
- Lambda functions
- Modules
- Packages
- File Handling
- Exception Handling
- Debugging Python
- Python for Data Science: NumPy (1 hr)
- NumPy Introduction
- Numerical operations with NumPy
- Python for Data Science: Matplotlib (1 hr)
- Getting started with Matplotlib
- Python for Data Science: Pandas (1 hr)
- Getting started with Pandas
- Data Frame basics
- Key Operations on Data Frames
- Python for Data Science: Computational Complexity (1 hr)
- Space and Time Complexity: Finding the largest number in a list
- Binary search
- Finding elements common to two lists
- SQL
- Introduction to Database
- Why SQL?
- Execution of an SQL statement
- IMDB Dataset
- Installing MySQL
- Load IMDB data
- Use, Describe, Show table
- Select
- Limit, Offset
- Order By
- Distinct
- Where, Comparison Operators, NULL
- Logic Operators
- Aggregate Functions: COUNT, MIN, MAX, AVG, SUM
- Group By
- Having
- Order of Keywords
- Join and Natural Join
- Inner, Left, Right, and Outer Joins
- Sub Queries/Nested Queries/Inner Queries
- DML: INSERT
- DML: UPDATE, DELETE
- DDL: CREATE TABLE
- DDL: ALTER, ADD, MODIFY, DROP
- DDL: DROP TABLE, TRUNCATE, DELETE
- Data Control Language: GRANT, REVOKE
- Learning Resources
- Exploratory Data Analysis and Data Visualization
- Plotting for Exploratory Data Analysis (EDA)
- Introduction to Iris dataset and 2D scatter-plot
- 3D Scatter-plot
- Pair plots
- Limitations of Pair plots
- Histogram and introduction to PDF (Probability Density Function)
- Univariate analysis using PDF
- CDF (Cumulative Distribution Function)
- Variance, Standard Deviation
- Median
- Percentiles and Quantiles
- IQR (Interquartile Range), MAD (Median Absolute Deviation)
- Box-plot with whiskers
- Violin plots
- Summarizing plots, Univariate, Bivariate, and Multivariate analysis
- Multivariate probability density, contour plot
- Probability and Statistics
- Introduction to Probability and Statistics
- Population & Sample
- Gaussian/Normal Distribution and its PDF (Probability Density Function)
- CDF (Cumulative Distribution Function) of Gaussian/Normal Distribution
- Symmetric distribution, Skewness, and Kurtosis
- Standard normal variate (z) and standardization
- Kernel density estimation
- Sampling distribution & Central Limit Theorem
- Q-Q Plot: Is a given random variable Gaussian distributed?
- How are distributions used?
- Chebyshev’s inequality
- Discrete and Continuous Uniform distributions
- How to randomly sample data points (Uniform Distribution)
- Bernoulli and Binomial distribution
- Log-normal
- Power law distribution
- Box-Cox transform
- Applications of Non-Gaussian Distributions
- Covariance
- Pearson Correlation Coefficient
- Spearman Rank Correlation Coefficient
- Correlation vs Causation
- How to use Correlations?
- Confidence Intervals (C.I.): Introduction
- Computing a confidence interval given the underlying distribution
- C.I. for the mean of a normal random variable
- Confidence Interval using bootstrapping
- Hypothesis Testing methodology, Null-hypothesis, p-value
- Hypothesis testing intuition with coin toss example
- Resampling and permutation test
- K-S Test for the similarity of two distributions
- Code Snippet K-S Test
- Hypothesis Testing: another example
- Resampling and permutation test: another example
- How to use Hypothesis testing?
- Proportional Sampling
- Dimensionality reduction and Visualization
- What is dimensionality reduction?
- Row vector, and Column vector
- How to represent a dataset?
- How to represent a dataset as a Matrix
- Data preprocessing: Feature Normalization
- Mean of a data matrix
- Data preprocessing: Column Standardization
- Covariance of a Data Matrix
- MNIST dataset (784 dimensional)
- Code to load MNIST data set
- Principal Component Analysis
- Why learn PCA?
- Geometric intuition
- Mathematical objective function
- Alternative formulation of PCA: distance minimization
- Eigenvalues and eigenvectors
- PCA for dimensionality reduction and visualization
- Visualize MNIST dataset
- Limitations of PCA
- Code example
- PCA for dimensionality reduction (not visualization)
- t-distributed stochastic neighbor embedding (t-SNE)
- What is t-SNE?
- Neighborhood of a point, Embedding
- Geometric intuition
- Crowding problem
- How to apply t-SNE and interpret its output (distill.pub)
- t-SNE on MNIST
- Code example
- Foundations of Machine Learning
- Classification and Regression Models: K-Nearest Neighbors
- Classification algorithms in various situations
- Performance measurement of models
- Naive Bayes
- Logistic Regression
- Linear Regression
- Solving optimization problems
- Machine Learning II (Supervised Learning Models)
- Support Vector Machines (SVM)
- Decision Trees
- Ensemble Models
- Data Mining (Unsupervised Learning)
- Unsupervised learning/Clustering
- What is Clustering?
- Unsupervised learning
- Applications
- Metrics for Clustering
- K-Means: Geometric intuition, Centroids
- K-Means: Mathematical formulation: Objective function
- K-Means Algorithm
- How to initialize: K-Means++
- Failure cases/Limitations
- K-Medoids
- Determining the right K
- Time and Space complexity
- Hierarchical clustering Technique
- Agglomerative & Divisive, Dendrograms
- Agglomerative Clustering
- Proximity methods: Advantages and Limitations
- Time and Space Complexity
- Limitations of Hierarchical Clustering
- Code sample
- DBSCAN (Density-based clustering)
- Density-based clustering
- MinPts and Eps: Density
- Core, Border and Noise points
- Density edge and Density connected points
- DBSCAN Algorithm
- Hyper Parameters: MinPts and Eps
- Advantages and Limitations of DBSCAN
- Time and Space Complexity
- Code samples
- Case Study 1
- Case Study 2